Introduction

This project focused on the development of data analytic and statistical methodology for data sets
characterized by one or more of the following properties: large size, high dimension, and
nonhomogeneity. A major thrust was the visualization of both data point clouds and mathematical structures
in high dimensions. Several techniques were proposed, including parallel coordinate density plots,
3-dimensional Andrews plots, and grand (or guided) tours in 3 or higher dimensions. A combination of
mathematical analysis and graphics display is our basic approach to these visualization problems. A closely
related area is structural inference for high dimensional structures. By this is meant the estimation
of solid structures, including k-dimensional flats in n-dimensional space as well as other nonlinear manifolds in
high dimensional space. Proposed techniques involved 1. the detection and estimation of k-flats, thick k-flats
and nonlinear manifolds of modest curvature by exploitation of the projective duality for parallel coordinates
and 2. the estimation of more severely curved manifolds by use of ridges on k-dimensional density estimates.
The parallel coordinate projective duality is that in parallel coordinates lines are represented by points and vice
versa. Since k linearly independent lines are sufficient to uniquely specify a k-flat, it appeared to be possible to
identify an arbitrarily oriented k-flat in n-space by appropriately exploiting parallel coordinates.

We proposed to focus on several aspects of computational statistics. The main focus was the
development of methods for the visualization of multidimensional structure. The visualization of
multidimensional structure is a key element in exploratory analysis of high dimensional data but, of course,
has much broader spinoff for other scientific areas. We suggested four research topics related to
visualization: 1. Three-Dimensional Andrews and Related Plots, 2. The Grand Tour in Three Dimensions, 3.
Finding Structure in k-Dimensions Using Grand Tour and Parallel Coordinates, and 4. Structural Inference
using Ridge Estimation in Hyperspace.

Three-Dimensional Andrews and Related Plots

An Andrews plot is a multidimensional plotting device that is somewhat related to the parallel
coordinate methodology. There are several conceptual viewpoints that can be described in connection with
Andrews plots. First of all, think of a data vector (X_1, ..., X_n) as represented by pairs of the form
(1, X_1), ..., (n, X_n). One way to think of the parallel coordinate plot is as a linear interpolation between
these points. The reason for using a linear interpolation is that the transformation from Cartesian space to
parallel coordinate space is a projective transformation and, thus, leads to an elegant geometric interpretation
of mathematical structure. In particular, we can map Cartesian geometric structures into parallel coordinate
geometric structures. However, other general sets of interpolations may be suggested. The earliest one is
essentially a Fourier interpolation. That is, plot a multidimensional vector as a trigonometric polynomial
expansion with coefficients determined by the weights X_j. Specifically, Andrews suggests plotting

f(θ; X) = X_1/√2 + X_2 sin(θ) + X_3 cos(θ) + X_4 sin(2θ) + X_5 cos(2θ) + ···

Each unique point gets mapped into a unique trigonometric polynomial. These are then plotted in a way similar
to parallel coordinate plots. Two properties of Andrews plots are interesting. First, because of the Fourier
series interpretation, the classic Parseval's Theorem holds. Parseval's Theorem basically has to do with
L2-norms and asserts that mean square error in the Fourier domain and mean square error in the untransformed
domain are the same. Thus, while the untransformed domain is n-dimensional Euclidean space, the Fourier
domain is 2-dimensional space, so that by looking at an Andrews plot we can visually get an idea of the mean
square error structure. The second property relates to the fact that we are talking about orthonormal
trigonometric series. Because of this (thinking of the x-axis variable as an angle, say θ), for every θ we get a
different linear weighting of the X_j's. We can think of a slice at θ as a 1-dimensional projection of the
multivariate vector onto an axis whose orientation is determined by θ. It has been argued that this in effect
gives us a one-dimensional grand tour. As with any grand tour, this offers us the possibility of looking for
orientations that show up interesting or unusual properties.
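
The Parseval-type property can be checked numerically. The following is a minimal sketch, assuming the
standard Andrews basis 1/√2, sin(θ), cos(θ), sin(2θ), cos(2θ), ... on the interval [-π, π); the function
names andrews_basis and andrews_curve are hypothetical and are not taken from the software described in this
report. With this basis, the integrated squared difference between two curves equals π times the squared
Euclidean distance between the two data vectors.

```python
import numpy as np

def andrews_basis(n, theta):
    """Rows are 1/sqrt(2), sin(t), cos(t), sin(2t), cos(2t), ... evaluated at theta."""
    theta = np.asarray(theta, dtype=float)
    rows = [np.full_like(theta, 1.0 / np.sqrt(2.0))]
    for j in range(1, n):
        harmonic = (j + 1) // 2                 # 1, 1, 2, 2, 3, 3, ...
        trig = np.sin if j % 2 == 1 else np.cos
        rows.append(trig(harmonic * theta))
    return np.vstack(rows)                      # shape (n, len(theta))

def andrews_curve(x, theta):
    """Andrews curve of the data vector x evaluated on the grid theta."""
    x = np.asarray(x, dtype=float)
    return x @ andrews_basis(x.size, theta)

# Two illustrative 5-dimensional data vectors (made-up values).
x = np.array([1.0, -0.5, 2.0, 0.3, -1.2])
y = np.array([0.2, 1.5, -0.7, 0.9, 0.4])

# A uniform periodic grid; the rectangle rule is exact here for low-order harmonics.
m = 2048
theta = np.linspace(-np.pi, np.pi, m, endpoint=False)
curve_distance = (2.0 * np.pi / m) * np.sum((andrews_curve(x, theta) - andrews_curve(y, theta)) ** 2)
vector_distance = np.pi * np.sum((x - y) ** 2)
```

Each column of andrews_basis(n, theta) is the weighting vector for one slice of the plot, which is the sense
in which a slice at θ is a 1-dimensional projection of the data vector.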

There is nothing inherently sacred about either the piecewise linear (parallel coordinate plot) or the
trigonometric (Andrews plot) interpolation. The former is useful because it preserves geometric properties, the
latter because of the mean square interpretation. The 1-dimensional grand tour would work with any
orthonormal series, so there may be some other interesting orthonormal series to think about. It may be that we
can invent series which highlight different properties so that we can have a family of plots designed to explore
different aspects of the structure. That is to say, if we are interested in highlighting clustering or outliers, we
propose to invent an orthogonal series that would exaggerate those aspects of the data in the plot. Thus, we
could generalize the parallel coordinate and Andrews plots.

Our work in this area was published in Wegman and Shen (1993) and also described in Wegman, Carr
and Luo (1993), and Wegman and Carr (1993). One key result was to do an expansion in two dimensions
instead of just one. What I have described before is an expansion f(θ; X), where X = (X_1, ..., X_n) and f is
either a piecewise linear interpolant or a trigonometric series. I used a bivariate expansion, say
f(θ; X) = (f_1(θ), f_2(θ)), as a 2-dimensional Fourier transform with irrational phase ratio (or, in fact, any
orthonormal series). In this situation I was able to preserve the Parseval-type property and create the
two-dimensional pseudo-grand tour. We think of a 3-dimensional plot, plotting f(θ) against θ. With the θ axis
corresponding to the x-axis and f to the y-z plane, we implemented this in our VR lab with rotation around the
x-axis to help visualize the three-dimensional structure. Having a three-dimensional plot helps uncover more
structure in the data than a simple two-dimensional plot would. Moreover, we are able to rotate the plot so that
the y-z plane is the screen plane. Then slicing this graph along the x-axis would correspond to doing a
two-dimensional grand tour. This provided a unified treatment of Andrews/parallel-coordinate-type plots with
the grand tour idea.
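
A bivariate expansion of this kind can be sketched as follows. This is an illustration under stated
assumptions, not the implementation reported in the cited papers: the sine/cosine pairing of coordinates, the
choice of square roots of primes as frequencies (so that no two frequencies have a rational ratio), the
normalization, and the names pseudo_tour_frame and pseudo_tour_curve are all assumptions made for the example.
It requires an even number of variables.

```python
import numpy as np

# Assumed frequencies: square roots of distinct primes, which are linearly
# independent over the rationals, so no pair has a rational ratio.
_PRIMES = np.array([2, 3, 5, 7, 11, 13, 17, 19, 23, 29], dtype=float)

def pseudo_tour_frame(n, theta):
    """Return two orthonormal n-vectors (a1, a2) that vary smoothly with theta."""
    if n % 2 != 0 or n < 2 or n > 2 * _PRIMES.size:
        raise ValueError("n must be even and between 2 and 20 in this sketch")
    phase = np.sqrt(_PRIMES[: n // 2]) * theta
    a1 = np.empty(n)
    a2 = np.empty(n)
    a1[0::2], a1[1::2] = np.sin(phase), np.cos(phase)
    a2[0::2], a2[1::2] = np.cos(phase), -np.sin(phase)
    scale = np.sqrt(2.0 / n)                    # makes both vectors unit length
    return scale * a1, scale * a2

def pseudo_tour_curve(x, thetas):
    """Evaluate f(theta; x) = (x . a1(theta), x . a2(theta)) for each theta."""
    x = np.asarray(x, dtype=float)
    return np.array([[x @ a for a in pseudo_tour_frame(x.size, t)] for t in thetas])

# One illustrative 4-dimensional data vector (made-up values).
x = np.array([1.0, -2.0, 0.5, 3.0])
thetas = np.linspace(0.0, 2.0 * np.pi, 50)
curve = pseudo_tour_curve(x, thetas)            # shape (50, 2): the (y, z) trace along the x-axis
a1, a2 = pseudo_tour_frame(x.size, 0.7)
```

Plotting curve[:, 0] and curve[:, 1] against thetas gives the 3-dimensional curve for one data vector; a slice
at a fixed θ is the projection of the data onto the 2-flat spanned by a1(θ) and a2(θ).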

The  Grand  Tour  in  Three  and  Four  Dimensions 

The grand tour is a very interesting idea first and primarily exposited by Asimov, and never really
given its full due, we believe, because it is computationally intensive and technically fairly difficult. The
intuitive idea underlying the grand tour is as its name suggests: if we want to investigate a data set, we look
at it "from all angles," much as if we were doing a grand tour of a geographic place we would try to look at it
in all aspects. Thus, for example, if we are exploring a ten-dimensional data set, we would like to look at it
from as many different perspectives as we could. The original mathematical implementation was as projections
into two-dimensional planes (flats). The collection of all two-dimensional flats in an n-dimensional Euclidean
space is called a Grassmannian manifold. The idea is to create a space filling path (i.e. one that visits all
elements of the Grassmannian manifold) in some continuous fashion with the additional restriction that the
proportion of time spent in each region is proportional to the size of that region. That is to say, we do not
linger in a small region of the whole space. If we then think of stepping along this path, we get a series of
2-dimensional planes onto which we can project the n-dimensional point cloud. If there is no structure in the
point cloud, then every two-dimensional projection should look like an uncorrelated scatter diagram. If there
is (two-dimensional) structure, then some projections will have interesting nontrivial patterns and these can
be modeled. Two problems arise. First, if the dimension of the data space is high, then the number of two-flats
needed to get a reasonably dense collection of two-planes is very large. This means that in any real
implementation there is a tradeoff between density of planes and reasonable viewing time. However, if the
density of viewing planes is fairly low, some perspectives will be missed and consequently some interesting
projections may be lost. Second, the methods for choosing the path through the Grassmannian manifold are
either computationally very tedious or mathematically inelegant and visually unappealing. Moreover, even if
these aspects could be dramatically improved, it is clear that looking at a sequence of two-dimensional
projections will allow us to detect unusual two-dimensional patterns, but it will not necessarily allow us to
detect unusual patterns in 3 dimensions.

The fact that we have 3-dimensional display devices suggests that we could create a grand tour in three
dimensions, and we have tried doing so. The idea is that in an n-dimensional space there would be a large number
of three-dimensional subspaces. Instead of stepping through a sequence of two-flats, we could step through a
sequence of three-flats. There are, of course, as many two-dimensional flats (coordinate systems) as there are
three-dimensional flats (coordinate systems) in the sense that both have the same cardinality and are
uncountably infinite. Nonetheless, in a practical implementation, we do not have to step through as many 3-D
coordinate systems as 2-D coordinate systems in order to densely approximate all possible systems. In the
two-dimensional grand tour we are interested in determining two-flats. These will be determined by a pair of
unit length vectors, say (a, b), which are orthogonal and which span a given plane. Of course, if each of the
components of (a, b) contains only 0s and 1s, these will correspond to planes of the original coordinate axes
system. Thus the 2-flat of interest is span(a, b). We have achieved two important results: 1) We have
generalized the grand tour to general k-dimensional representations, i.e. we have created a time-dependent
series of orthonormal vectors in k-dimensions, (a_1(t), a_2(t), ..., a_k(t)) (see description below) and, 2) We
have found a computationally efficient algorithm for a 2-dimensional pseudo-grand tour (see description above).
These results were reported in Wegman (1991b), Wegman and Shen (1993), Wegman, Carr and Luo (1993) and
Wegman and Carr (1993).
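
The projection step of a tour can be sketched in a few lines. The code below is an illustration only: it draws
independent random orthonormal k-frames by QR factorization of a Gaussian matrix, which is an assumed stand-in
for the continuous, space-filling path through the Grassmannian manifold and is not the algorithm reported in
the papers cited above. The names random_k_frame and project are hypothetical.

```python
import numpy as np

def random_k_frame(n, k, rng):
    """Return an n x k matrix whose columns are orthonormal and span a random k-flat."""
    gaussian = rng.standard_normal((n, k))
    q, _ = np.linalg.qr(gaussian)               # reduced QR: q has shape (n, k)
    return q

def project(data, frame):
    """Coordinates of each row of data in the k-flat spanned by the frame columns."""
    return data @ frame

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 10))           # made-up 10-dimensional point cloud

# A short sequence of 3-flats, each giving one 3-dimensional view of the cloud.
views = [project(data, random_k_frame(10, 3, rng)) for _ in range(5)]

# A frame whose columns contain only 0s and 1s picks out original coordinate axes.
axis_frame = np.eye(10)[:, [0, 2]]
axis_view = project(data, axis_frame)           # identical to data[:, [0, 2]]
```

A real tour would replace the independent draws with frames that change smoothly in t, so that successive
views can be animated.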

Finding  Structure  in  k-Dimensions  using  Grand  Tour  and  Parallel  Coordinates 

The project here is conceptually closely related to our earlier discussions of the grand tour. As
indicated earlier, the advantage of doing a 3-dimensional grand tour is two-fold. First, it allows one to see
unusual 3-dimensional configurations instead of simply unusual 2-dimensional configurations. Second, it
allows a more complete search of the k-dimensional space because, for practical purposes, there are fewer
3-flats needed than 2-flats to attain the same density. Because parallel coordinates is a convenient tool for
representing data in dimensions 4 and higher, a natural suggestion is to combine the parallel coordinate
representation with the grand tour notion. Generalizing our earlier notion, suppose (a_1, a_2, ..., a_k) is a
set of k mutually orthogonal unit vectors whose span is a k-flat. This is, so to speak, a Grassmannian
manifold of k-flats instead of 2-flats. The idea is to find a continuous, dense path through this Grassmannian
manifold and use the k-flats (k-dimensional coordinate systems) so generated as a sequence of coordinate
systems in which we can plot the data. Of course, we would not plot using Cartesian coordinates, but we can
plot using parallel coordinates. Again we would be searching for unusual structure. One structure that would
be of interest is finding that the data lie on one or more k-flats or other k-manifolds. For example, verifying
that the data were co(k-)planar in some orientation of a k-flat would essentially suggest that a multiple
linear regression with 1 dependent and (k - 1) independent variables is an appropriate model. Other structures
would suggest other models.

The trade-off is obvious. As k gets larger, the ability to look for unusual higher dimensional structure
improves. Also the density of k-flats is much higher than the density of 2- or 3-flats and so it appears
plausible that we could look more closely at the n-dimensional space. The bad news is that the computation of
the unit vectors (a_1, a_2, ..., a_k) is likely to become computationally more intensive. How bad this might be
is not yet clear.

Let us consider a related observation. We know that lines in parallel coordinates represent points in
Euclidean space and, similarly, points in parallel coordinates represent lines in Euclidean space. Suppose we
have a bunch of points in Euclidean space chosen randomly except that they all lie on a plane, say a 2-flat.
They are represented by a collection of line segments joining parallel coordinate axes. Let the i-th point in
Euclidean n-space be represented by L^(i), and let its segment between axis j and axis j + 1 be L_j^(i),
j = 1, 2, ..., n - 1. The intersection of L_j^(i) and L_j^(i'), the corresponding segment for another point i',
is a point in parallel coordinate space representing a line in Euclidean space; denote it by P_j. Joining P_j
to P_(j+1) gives us a new line segment in parallel coordinate space, say C_j, which represents a point in
Euclidean n-space. Since the lines represented by the P_j are coplanar, their intersections, represented by the
C_j, are also coplanar. This implies that all of the segments C_j should have a common intersection as j ranges
from 1 to n. Indeed, if there is not one but several intersections for each j, this suggests that there are not
one but several planes. Generalizing this process to higher dimensions suggests another diagnostic tool for
detecting when a point cloud lies on one or more k-flats. Coupled with the k-dimensional grand tour, this may
be a very powerful geometric diagnostic tool for inferring data structure in higher dimensions.
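
The point-line duality behind this diagnostic is easiest to see in two dimensions. The sketch below assumes
two parallel axes placed at horizontal positions 0 and 1 (the unit spacing and the names pc_line and
pc_intersection are assumptions made for the example). Points lying on the Cartesian line x2 = m*x1 + b, with
m not equal to 1, map to parallel coordinate lines that all pass through the single point
(1/(1 - m), b/(1 - m)).

```python
import numpy as np

def pc_line(point):
    """Intercept and slope of the parallel coordinate line through (0, x1) and (1, x2)."""
    x1, x2 = point
    return x1, x2 - x1

def pc_intersection(p, q):
    """Intersection, in parallel coordinate space, of the lines representing points p and q."""
    c_p, s_p = pc_line(p)
    c_q, s_q = pc_line(q)
    u = (c_q - c_p) / (s_p - s_q)               # horizontal position of the crossing
    return np.array([u, c_p + u * s_p])

# Made-up points on the Cartesian line x2 = m * x1 + b.
m, b = -0.5, 2.0
x1 = np.linspace(-3.0, 3.0, 7)
points = np.column_stack([x1, m * x1 + b])

# Intersect the parallel coordinate lines of consecutive points.
dual = np.array([pc_intersection(points[i], points[i + 1]) for i in range(len(points) - 1)])
expected = np.array([1.0 / (1.0 - m), b / (1.0 - m)])
```

Every row of dual coincides with expected, which is the common intersection described above. With noisy data
the intersections would scatter around that point instead of coinciding exactly.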

A related problem is to diagnose nonlinear structure. If we have data on a nonlinear k-manifold, the
given technique may not be entirely appropriate. The technique is fairly robust to some scatter off
of the plane (i.e., when dealing with a thick slab). Given that robustness, a k-manifold which has small to
moderate curvature may be regarded as a modestly thick slab and, although the segments will not have exactly
common intersections, the intersections should cluster tightly. The idea is then to introduce nonlinear
transformations of the data and look at the plot of the intersections as a graphical tool for diagnosing how
well the transformation is linearizing the data fit. Of course, if the k-manifold is highly curved, there may
not be any indication of planarity. This work was reported in Wegman (1991b) and has been coded into a software
package titled ExplorN, which is co-authored by Carr, Luo, Wegman and Shen.

Structural Inference using Ridge Estimation in Hyperspace

This problem arises from an attempt to abstract the general idea of nonparametric regression. The idea
of regression, of course, is that there is a response variable, say Y, and one or more predictor variables, say
X_1, ..., X_d. In regression we attempt to find a function, say f, so that Y is approximated by
f(X_1, ..., X_d) in some sense, usually least squares. This gives the random variable Y some sort of preferred
status over the variables X_1, ..., X_d. This may or may not be appropriate. We can, however, think of the
variables Y, X_1, ..., X_d as a vector which describes a point in a (d+1)-dimensional space. These points
satisfy some functional relationship; that is, there exists a function, say F, such that
F(Y, X_1, ..., X_d) = 0. Another way of thinking about this is geometrically, i.e.
M = {(Y, X_1, ..., X_d): F(Y, X_1, ..., X_d) = 0} is some sort of hypersurface of dimension k embedded in a
d dimensional space. To make this concrete by an example, let d = 2, k = 1 and F(Y, X) = Y - sin(X). Then the
points (Y, X) in M are exactly the points in two dimensions lying on the Y = sin(X) curve; M is a
one-dimensional set in a two-dimensional space. Because we are dealing with random variables, we cannot expect
the points to lie exactly on the hypersurface M (technically M should be called an algebraic manifold), but to
be scattered off of it. Thus we should really think about F(Y, X_1, ..., X_d) = ε, so that taking expected
values we find that M = {(Y, X_1, ..., X_d): E F(Y, X_1, ..., X_d) = 0} is the manifold we would like to
estimate. Notice in the regression case if we let F(Y, X_1, ..., X_d) = Y - f(X_1, ..., X_d), then
F(Y, X_1, ..., X_d) = ε corresponds to Y = f(X_1, ..., X_d) + ε. In general, in this description we have
left Y in to draw the analogy to usual regression, but Y is not intended to have a preferred status. Thus from
now on we shall simply consider F(X_1, ..., X_d) and define M as {(X_1, ..., X_d): E F(X_1, ..., X_d) = 0}.
Thus finding the functional relationship among the X's (i.e., in my language, structural inference) is
equivalent to estimating the manifold M. Since M is a geometric structure in hyperspace, we have the potential
of visualizing it through some of our graphical techniques.

We suggest a connection with probability densities. Consider a plot of a two-dimensional normal
density. In general this will be a surface in three dimensions. If we try to think of the best zero-dimensional
summary of the density, most people would probably suggest the mean. Since the mean and mode of the normal
are co-located, this would also be the mode. Let me use language which suggests a solution for higher
dimensions. The best zero-dimensional summary is the location of the mode, which is the projection of the
maximal zero-dimensional manifold on the surface of the density. If we try to think of the best one-dimensional
summary, think of the fact that slices of the density parallel to the X-Y plane have elliptical cross sections
with a major axis and a minor axis. Operationally we would probably want to choose our summary as the major
axis of the density. Notice that if the cross section were circular, the correlation would be 0 and there would
be no difference between major and minor axes. Basically it would not make sense to talk about a best
one-dimensional summary. If, however, the correlation were plus or minus 1, the minor axis would have zero
length and the major axis would coincide with the usual regression line. (Because of perfect correlation there
would be no scatter off of this line.) If we think of the ridge on the density surface (ridge in the intuitive
sense, like on a mountain or hill), the major axis will lie beneath this ridge. In some sense, the ridge we have
just described is the maximal one-dimensional manifold on the surface of the two-dimensional density. The best
1-dimensional manifold estimate is the support of the ridge, i.e. the closure of the set of points for which
the ridge is positive. The idea in general is to find the maximal k-dimensional manifold on the d-dimensional
surface of the density, which we will define as the k-ridge. The k-skeleton is the k-dimensional manifold which
is the support of the k-ridge in the d-dimensional space. The research problem was to construct a suitable
definition of the k-ridge and to construct reasonable estimators. A potentially reasonable estimation procedure
for the k-ridge is to estimate the probability density function and find the maximal k-ridge on it. Another
element of the research was to implement a 3-dimensional surface projection of the k-skeleton for k = 2 or 3
either on the Silicon Graphics machine or using our VR immersive technology. The 0-skeleton is the mode. These
other estimators are multidimensional analogues of the mode. This work has been reported in Wegman, Carr and
Luo (1993) and in numerous invited presentations. The completed research will form the substance of the
dissertation of our Ph.D. student, Qiang Luo. Mr. Luo will be awarded his Ph.D. in May, 1995. The work has also
been made available in software entitled MasonRidge, authored by Luo and Wegman.
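
The bivariate normal case can be illustrated numerically. The sketch below is not the k-ridge estimator
developed in this research and is unrelated to the MasonRidge software: it uses a known normal density rather
than an estimated one, and the covariance values and function names are assumptions chosen for the example. It
takes the major axis as the leading eigenvector of the covariance matrix and checks that, along a cut
perpendicular to that axis, the density peaks exactly on the axis.

```python
import numpy as np

def normal_density(points, mean, cov):
    """Bivariate normal density evaluated at each row of points."""
    diff = points - mean
    inv = np.linalg.inv(cov)
    quad = np.einsum("ij,jk,ik->i", diff, inv, diff)
    return np.exp(-0.5 * quad) / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))

def major_axis(cov):
    """Unit vector along the major axis: eigenvector of the largest eigenvalue."""
    _, vecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
    return vecs[:, -1]

mean = np.zeros(2)
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])                    # assumed correlation of 0.8

u = major_axis(cov)                             # direction of the ridge
v = np.array([-u[1], u[0]])                     # perpendicular (minor axis) direction

# Cut across the density through a point on the major axis, away from the mode.
base = mean + 1.5 * u
offsets = np.linspace(-2.0, 2.0, 401)
cut = base + offsets[:, None] * v
ridge_offset = offsets[np.argmax(normal_density(cut, mean, cov))]
```

Here ridge_offset is zero up to rounding, so the highest point of the cut sits on the major axis. Setting the
off-diagonal entries of cov to 0 makes the cross sections circular, in which case no direction is preferred.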


Other  Work 

The four topic areas described above were the topics outlined in the research proposal upon which the
award was made. However, there has been an extensive amount of additional work produced under this
contract. This additional work generally falls into the categories of: 1) nonparametric density and function
estimation (Le and Wegman, 1991; Miller and Wegman, 1991; Hearne and Wegman, 1991; Hearne and
Wegman, 1992; Le and Wegman, 1993; Hearne, 1994; Marchette et al. 1994; Solka et al. 1994a; Hearne and
Wegman, 1994 and Solka et al. 1994b), 2) parallel and high performance computing in statistics (Wegman,
1991a; Xu, Miller and Wegman, 1991; Sullivan and Wegman, 1994; Poston and Solka, 1994; Wegman and
Jones, 1994; Takacs, Wegman and Wechsler, 1994; Fauntleroy and Wegman, 1994; Wegman, 1994; Sullivan,
1994; and Sullivan and Wegman, 1995), 3) stochastic modeling (Wegman and Habib, 1992; and Chow, 1994)
and, finally, 4) historical (Wegman, 1992; Wegman, 1993).